Experiments with Annotating Discourse Relations in the Hindi Discourse Relation Bank
Abstract
In the Hindi Discourse Relation Bank (HDRB) project, we are developing a large corpus annotated with discourse relations, such as causal, temporal, contrastive and conjunctive relations. Adopting the lexically grounded approach of the Penn Discourse Treebank (PDTB), we annotate the argument structure of both explicit and implicit discourse relations, as well as the senses of relations. We describe our initial annotation experiments, which have led to the discovery of additional connective classes and the development of a modified sense classification scheme. We also present some distributional results from our initial annotations, and propose some insightful cross-linguistic generalizations by comparisons with the discourse relation distributions of English texts in the PDTB. Finally, we present an additional study of the properties of some connectives that belong to the class of discourse adverbials.
Similar resources
Assessment of Different Workflow Strategies for Annotating Discourse Relations: A Case Study with HDRB
In this paper, we present our experiments with different annotation workflows for annotating discourse relations in the Hindi Discourse Relation Bank (HDRB). In view of the growing interest in the development of discourse data-banks based on the PDTB framework and the complexity associated with discourse annotation, it is important to study and analyze approaches and practices followed in the...
Evaluation of Discourse Relation Annotation in the Hindi Discourse Relation Bank
We describe our experiments on evaluating recently proposed modifications to the discourse relation annotation scheme of the Penn Discourse Treebank (PDTB), in the context of annotating discourse relations in the Hindi Discourse Relation Bank (HDRB). While the proposed modifications were driven by the desire to introduce greater conceptual clarity in the PDTB scheme and to facilitate better annotat...
The Hindi Discourse Relation Bank
We describe the Hindi Discourse Relation Bank project, aimed at developing a large corpus annotated with discourse relations. We adopt the lexically grounded approach of the Penn Discourse Treebank, and describe our classification of Hindi discourse connectives, our modifications to the sense classification of discourse relations, and some crosslinguistic comparisons based on some initial annot...
Towards an Annotated Corpus of Discourse Relations in Hindi
We describe our initial efforts towards developing a large-scale corpus of Hindi texts annotated with discourse relations. Adopting the lexically grounded approach of the Penn Discourse Treebank (PDTB), we present a preliminary analysis of discourse connectives in a small corpus. We describe how discourse connectives are represented in the sentence-level dependency annotation in Hindi, and disc...
Concurrent Discourse Relations
The Penn Discourse Treebank (PDTB) was released to the public in 2008 and remains the largest corpus of manually annotated discourse relations — both relations that are signaled explicitly (e.g., by a coordinating or subordinating conjunction, or by a discourse adverbial or other construction) and ones that otherwise appear implicit. The Penn Discourse TreeBank also diverges from other discours...